Two Word Reordering Strategies in English-to-Chinese Translation
Abstract
In English-to-Chinese machine translation, reordering mistakes are frequently caused by mislocated prepositional phrases (PPs). In English, a PP is often located after the host it attaches to, while in Chinese it usually precedes the host. Recent phrase-based MT approaches tend to ignore such syntactic information. We propose two strategies to resolve this problem. First, for MT that adopts the lexical reordering model, we change the reordering orientations from Monotone-Swap-Discontinuous (MSD) to Monotone-LeftDiscontinuous-RightDiscontinuous (MLR), which is more efficient in guiding the reordering direction of PPs. Second, for MT using the pre-reordering approach, we apply PP attachment disambiguation to find the host of each PP and then pre-reorder the PPs precisely. Our empirical studies verify the superiority of both approaches. Our work has already been applied in the Youdao online translation system (http://fanyi.youdao.com).
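The two strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation. The MSD function follows the usual definition of orientations in a lexical reordering model. The MLR function is an assumed reading of the abstract in which a swap is folded into a left jump and every other non-monotone move counts as a right jump. All function names, the inclusive span convention, and the example sentence are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) source-word indices


def msd_orientation(prev: Span, cur: Span) -> str:
    """MSD orientation of the current phrase relative to the previous one."""
    if cur[0] == prev[1] + 1:
        return "monotone"        # current phrase directly follows the previous one
    if cur[1] == prev[0] - 1:
        return "swap"            # current phrase directly precedes the previous one
    return "discontinuous"


def mlr_orientation(prev: Span, cur: Span) -> str:
    """Assumed MLR reading: any jump to the left (swap included) is one class."""
    if cur[0] == prev[1] + 1:
        return "monotone"
    if cur[1] < prev[0]:
        return "left_discontinuous"
    return "right_discontinuous"


def move_pp_before_host(tokens: List[str], host: Span, pp: Span) -> List[str]:
    """Pre-reorder: move a PP that follows its host to just before the host."""
    if pp[0] <= host[1]:
        return list(tokens)      # PP does not follow the host; leave unchanged
    pp_tokens = tokens[pp[0]:pp[1] + 1]
    rest = tokens[:pp[0]] + tokens[pp[1] + 1:]
    return rest[:host[0]] + pp_tokens + rest[host[0]:]


sentence = ["he", "bought", "a", "book", "in", "the", "store"]
# Assumption: attachment disambiguation picked the verb phrase (1, 3) as host.
reordered = move_pp_before_host(sentence, host=(1, 3), pp=(4, 6))
```

In this sketch the reordered source reads "he in the store bought a book", which is closer to Chinese word order. Under the assumed MLR scheme, a long leftward jump and an adjacent swap receive the same label, so the model only has to learn the direction in which a PP moves.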
Related resources
Chinese Syntactic Reordering for Statistical Machine Translation
Syntactic reordering approaches are an effective method for handling word-order differences between source and target languages in statistical machine translation (SMT) systems. This paper introduces a reordering approach for translation from Chinese to English. We describe a set of syntactic reordering rules that exploit systematic differences between Chinese and English word order. The result...
Reordered Search and Tuple Unfolding for Ngram-based SMT
In Statistical Machine Translation, the use of reordering for certain language pairs can produce a significant improvement on translation accuracy. However, the search problem is shown to be NP-hard when arbitrary reorderings are allowed. This paper addresses the question of reordering for an Ngram-based SMT approach following two complementary strategies, namely reordered search and tuple unfo...
The application of source language information in Chinese-English statistical machine translation
The quality of machine translation (MT) has been significantly improved by using statistical approaches. The integration of syntactic knowledge into a statistical MT system is still an open problem. This talk investigates the application of syntactic knowledge of the source language to the phrase-based MT system for translating Chinese into English. In this thesis, particular issues have been a...
Rule-Based Preordering on Multiple Syntactic Levels in Statistical Machine Translation
We propose a novel data-driven rule-based preordering approach, which uses the tree information of multiple syntactic levels. This approach extends tree-based reordering from one level to multiple levels, which makes it capable of handling more complicated reordering cases. We have conducted experiments in English-to-Chinese and Chinese-to-English translation directions. Our results show ...
To Swap or Not to Swap? Exploiting Dependency Word Pairs for Reordering in Statistical Machine Translation
Reordering poses a major challenge in machine translation (MT) between two languages with significant differences in word order. In this paper, we present a novel reordering approach utilizing sparse features based on dependency word pairs. Each instance of these features captures whether two words, which are related by a dependency link in the source sentence dependency parse tree, follow the ...
Publication date: 2013